Improving on Bagging with Input Smearing
Authors
Abstract
Bagging is an ensemble learning method that has proved to be a useful tool in the arsenal of machine learning practitioners. Commonly applied in conjunction with decision tree learners to build an ensemble of decision trees, it often leads to reduced prediction error when compared to using a single tree. A single tree is built from a training set of size N. Bagging is based on the idea that, ideally, we would like to eliminate the variance due to a particular training set by combining trees built from all training sets of size N. However, in practice, only one training set is available, and bagging simulates this platonic method by sampling with replacement from the original training data to form new training sets. In this paper we pursue the idea of sampling from a kernel density estimator of the underlying distribution to form new training sets, in addition to sampling from the data itself. This can be viewed as “smearing out” the resampled training data to generate new datasets, where the amount of “smear” is controlled by a parameter. We show that the resulting method, called “input smearing”, can lead to improved results when compared to bagging. We present results for both classification and regression problems.
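The abstract describes input smearing as drawing a bootstrap sample and then perturbing it with kernel noise, which for a Gaussian kernel amounts to adding per-attribute Gaussian noise to the resampled points. The sketch below illustrates that idea; the function name, the `smear` parameter, and the choice of scaling the noise by each attribute's standard deviation are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def smeared_bootstrap(X, y, smear=0.1, rng=None):
    """Draw one 'input smearing' training set (illustrative sketch).

    Takes a bootstrap sample of (X, y), then adds Gaussian noise to the
    numeric inputs, with per-attribute scale = smear * attribute std dev.
    This approximates sampling from a Gaussian kernel density estimate
    centered on the training points. Labels are left unchanged.
    """
    rng = rng or np.random.default_rng()
    n = len(X)
    idx = rng.integers(0, n, size=n)          # sample with replacement
    Xb, yb = X[idx], y[idx]
    scale = smear * X.std(axis=0)             # noise scale per attribute
    noise = rng.normal(0.0, scale, size=Xb.shape)
    return Xb + noise, yb
```

With `smear=0` this reduces to an ordinary bagging bootstrap sample; an ensemble would call it once per base learner and train each tree on the returned set.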
Similar articles
Improving reservoir rock classification in heterogeneous carbonates using boosting and bagging strategies: A case study of early Triassic carbonates of coastal Fars, south Iran
An accurate reservoir characterization is a crucial task for the development of quantitative geological models and reservoir simulation. In the present research work, a novel view is presented on the reservoir characterization using the advantages of thin section image analysis and intelligent classification algorithms. The proposed methodology comprises three main steps. First, four classes of...
Integrating Instance Selection and Bagging Ensemble using a Genetic Algorithm
Ensemble classification combines individually trained classifiers to obtain more accurate predictions than individual classifiers alone. Ensemble techniques are very useful for improving the generalizability of the classifier. Bagging is the method used most commonly for constructing ensemble classifiers. In bagging, different training data subsets are drawn randomly with replacement from the o...
Investigating the Effect of Underlying Fabric on the Bagging Behaviour of Denim Fabrics (RESEARCH NOTE)
Underlying fabrics can change the appearance, function and quality of the garment, and also add so much longevity of the garment. Nowadays, with the increasing use of various types of fabrics in the garment industry, their resistance to bagging is of great importance with the aim of determining the effectiveness of textiles under various forces. The current paper investigated the effect of unde...
Improving Adaptive Bagging Methods for Evolving Data Streams
We propose two new improvements for bagging methods on evolving data streams. Recently, two new variants of Bagging were proposed: ADWIN Bagging and Adaptive-Size Hoeffding Tree (ASHT) Bagging. ASHT Bagging uses trees of different sizes, and ADWIN Bagging uses ADWIN as a change detector to decide when to discard underperforming ensemble members. We improve ADWIN Bagging using Hoeffding Adaptive...
Attribute bagging: improving accuracy of classifier ensembles by using random feature subsets
We present attribute bagging (AB), a technique for improving the accuracy and stability of classifier ensembles induced using random subsets of features. AB is a wrapper method that can be used with any learning algorithm. It establishes an appropriate attribute subset size and then randomly selects subsets of features, creating projections of the training set on which the ensemble classifiers ar...